AITopics | stochastic gradient descent method

stochastic gradient descent method

Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent

Neural Information Processing Systems, Sep-30-2025, 12:58:12 GMT

We present and study a distributed optimization algorithm that employs a stochastic dual coordinate ascent method. Stochastic dual coordinate ascent methods enjoy strong theoretical guarantees and often perform better than stochastic gradient descent methods on regularized loss minimization problems. However, little effort has gone into studying them in a distributed framework. We make progress in this direction by presenting a distributed stochastic dual coordinate ascent algorithm in a star network, with an analysis of the tradeoff between computation and communication. We verify our analysis by experiments on real data sets. Moreover, we compare the proposed algorithm with distributed stochastic gradient descent methods and distributed alternating direction methods of multipliers for optimizing SVMs in the same distributed framework, and observe competitive performance.
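
For readers unfamiliar with the base method, the following is a minimal single-machine sketch of stochastic dual coordinate ascent for a hinge-loss SVM. It is not the distributed algorithm from the paper: the star network, the communication schedule, and the function name `sdca_svm` are not taken from the abstract, and the closed-form coordinate update shown is the textbook one for the objective (lam/2)*||w||^2 + (1/n)*sum of hinge losses, assumed here for illustration.

```python
import random

def sdca_svm(X, y, lam=0.1, epochs=50, seed=0):
    """Single-machine SDCA sketch for a hinge-loss SVM (illustrative only)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    alpha = [0.0] * n   # dual variables, each kept in [0, 1]
    w = [0.0] * d       # primal vector, w = (1/(lam*n)) * sum_i alpha_i * y_i * x_i
    for _ in range(epochs):
        for i in rng.sample(range(n), n):  # one random pass over the data
            xi, yi = X[i], y[i]
            sq_norm = sum(v * v for v in xi)
            if sq_norm == 0.0:
                continue
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Exact maximization of the dual over coordinate i, clipped to [0, 1].
            new_alpha = min(1.0, max(0.0, alpha[i] + (1.0 - margin) * lam * n / sq_norm))
            delta = new_alpha - alpha[i]
            alpha[i] = new_alpha
            # Keep w consistent with the updated dual variable.
            scale = delta * yi / (lam * n)
            w = [wj + scale * xj for wj, xj in zip(w, xi)]
    return w, alpha

# Toy, linearly separable data (made up for this sketch).
X = [[2.0, 1.0], [1.0, 2.0], [-2.0, -1.0], [-1.0, -2.0]]
y = [1, 1, -1, -1]
w, alpha = sdca_svm(X, y)
```

Each step touches one example and needs no learning rate, which is one reason dual coordinate methods are often compared favorably with SGD on this kind of problem.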

communication, stochastic dual coordinate ascent, trading computation, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.86)

First and Second Order Approximations to Stochastic Gradient Descent Methods with Momentum Terms

Lu, Eric

arXiv.org Artificial Intelligence, Apr-22-2025

Stochastic Gradient Descent (SGD) methods see many uses in optimization problems. Modifications to the algorithm, such as momentum-based SGD methods, have been known to produce better results in certain cases. Much of this evidence, however, is empirical rather than rigorously proven. While the dynamics of gradient descent methods can be studied through continuous approximations, existing works only cover scenarios with constant learning rates or SGD without momentum terms. We present approximation results under weak assumptions for SGD that allow learning rates and momentum parameters to vary with respect to time.
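
As a concrete picture of the setting, here is a minimal sketch of heavy-ball SGD in which both the learning rate and the momentum parameter depend on the step index. The schedules `lr` and `beta`, the quadratic test objective, and the Gaussian gradient noise are all assumptions chosen for illustration; the abstract does not specify them.

```python
import random

def sgd_momentum(grad, x0, lr, beta, steps, noise_std=0.0, seed=0):
    """Heavy-ball SGD on a scalar problem with step-dependent lr(k) and beta(k)."""
    rng = random.Random(seed)
    x, v = x0, 0.0
    for k in range(steps):
        # Noisy gradient: true gradient plus zero-mean Gaussian noise.
        g = grad(x) + noise_std * rng.gauss(0.0, 1.0)
        v = beta(k) * v - lr(k) * g   # velocity update with time-varying momentum
        x = x + v                     # parameter update
    return x

def lr(k):
    # Hypothetical decaying learning-rate schedule.
    return 0.1 / (1.0 + 0.01 * k)

def beta(k):
    # Hypothetical momentum schedule that ramps up and is capped at 0.9.
    return min(0.9, k / (k + 3.0))

def grad(x):
    # Gradient of f(x) = 0.5 * (x - 3)^2, whose minimizer is x = 3.
    return x - 3.0

x_final = sgd_momentum(grad, 0.0, lr, beta, steps=2000, noise_std=0.1)
```

Setting `noise_std=0.0` recovers deterministic heavy-ball gradient descent, which is a convenient baseline when comparing against a continuous-time approximation.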

artificial intelligence, machine learning, theorem 2, (14 more...)

arXiv.org Artificial Intelligence

2504.13992

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima

Ko, Taehee, Li, Xiantao

arXiv.org Machine Learning, May-30-2022

Loss functions with non-isolated minima have emerged in several machine learning problems, creating a gap between theory and practice. In this paper, we formulate a new type of local convexity condition that is suitable for describing the behavior of loss functions near non-isolated minima. We show that such a condition is general enough to encompass many existing conditions. In addition, we study the local convergence of SGD under this mild condition by adopting the notion of stochastic stability. The corresponding concentration inequalities from the convergence analysis help to interpret empirical observations from some practical training results.

artificial intelligence, machine learning, stochastic gradient descent method, (3 more...)

arXiv.org Machine Learning

2203.10973

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)

On a Variance Reduction Correction of the Temporal Difference for Policy Evaluation in the Stochastic Continuous Setting

Kobeissi, Ziad, Bach, Francis

arXiv.org Artificial Intelligence, Feb-16-2022

This paper deals with solving continuous time, state and action optimization problems in stochastic settings, using reinforcement learning algorithms, and considers the policy evaluation process. We prove that standard learning algorithms based on the discretized temporal difference are doomed to fail when the time discretization tends to zero, because of the stochastic part. We propose a variance-reduction correction of the temporal difference, leading to new learning algorithms that are stable with respect to vanishing time steps. This allows us to give theoretical guarantees of convergence of our algorithms to the solutions of continuous stochastic optimization problems.

algorithm, temporal difference, theorem 3, (16 more...)

arXiv.org Artificial Intelligence

2202.0796

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent

Yang, Tianbao

Neural Information Processing Systems, Feb-14-2020, 15:11:45 GMT

communication, stochastic dual coordinate ascent, trading computation, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Projected Semi-Stochastic Gradient Descent Method with Mini-Batch Scheme under Weak Strong Convexity Assumption

Liu, Jie, Takac, Martin

arXiv.org Machine Learning, May-4-2017

We propose a projected semi-stochastic gradient descent method with mini-batches (PS2GD) for improving both the theoretical complexity and practical performance of the general stochastic gradient descent method (SGD). We are able to prove linear convergence under a weak strong convexity assumption. This requires no strong convexity assumption for minimizing the sum of smooth convex functions subject to a compact polyhedral set, a problem that remains popular across the machine learning community. Our PS2GD preserves the low cost per iteration and high optimization accuracy via a variance-reduced stochastic gradient technique, and admits a simple parallel implementation with mini-batches. Moreover, PS2GD is also applicable to the dual problem of SVM with hinge loss.
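
The general idea of a projected, variance-reduced mini-batch step can be sketched as follows. This is an illustrative approximation, not the authors' PS2GD: it uses a fixed inner-loop length, a box as the compact polyhedral set, and a small least-squares problem, and the names `projected_vr_sgd`, `project_box`, `grad_i`, and `full_grad` are hypothetical.

```python
import random

def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^d (a compact polyhedral set)."""
    return [min(hi, max(lo, v)) for v in x]

def grad_i(x, a, b):
    """Gradient of the single-example loss 0.5 * (a.x - b)^2."""
    r = sum(aj * xj for aj, xj in zip(a, x)) - b
    return [r * aj for aj in a]

def full_grad(x, A, b):
    """Gradient of the average loss over all examples."""
    n = len(A)
    g = [0.0] * len(x)
    for a, bi in zip(A, b):
        g = [gj + gij / n for gj, gij in zip(g, grad_i(x, a, bi))]
    return g

def projected_vr_sgd(A, b, lo, hi, step=0.05, outer=30, inner=20, batch=2, seed=0):
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = project_box([0.0] * d, lo, hi)
    for _ in range(outer):
        snap = x[:]                   # snapshot point for this epoch
        mu = full_grad(snap, A, b)    # one full gradient per outer loop
        for _ in range(inner):
            g = mu[:]
            for i in rng.sample(range(n), batch):
                gx = grad_i(x, A[i], b[i])
                gs = grad_i(snap, A[i], b[i])
                # Variance-reduced estimate: full gradient plus mini-batch correction.
                g = [gj + (p - q) / batch for gj, p, q in zip(g, gx, gs)]
            x = project_box([xj - step * gj for xj, gj in zip(x, g)], lo, hi)
    return x

# Toy data (made up): the unconstrained minimizer is [2.0, 0.5], outside the box.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
b = [2.0, 0.5, 2.5, 1.5]
x_hat = projected_vr_sgd(A, b, lo=-1.0, hi=1.0)
```

On this toy problem the averaged Hessian is a multiple of the identity, so the constrained minimizer is simply the projection of [2.0, 0.5] onto the box, which makes the result easy to check by hand.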

artificial intelligence, machine learning, mini, (12 more...)

arXiv.org Machine Learning

1612.05356

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
